Abstract - Suspended sediment estimation is important for water
resources management and water quality problems. In this article,
artificial neural networks (ANN) and M5tree (M5T) approaches, together
with statistical approaches such as Multiple Linear Regression (MLR)
and Sediment Rating Curves (SRC), are used to estimate daily suspended
sediment concentration from daily water temperature and streamflow in
a river. The daily data were measured at a station in Iowa, USA. The
prediction approaches are compared with each other according to three
statistical criteria, namely mean square error (MSE), mean absolute
error (MAE) and correlation coefficient (R). When the results are
compared, the ANN approach forecasts suspended sediment better than
the other estimation methods.

Keywords - Suspended Sediment, Artificial Neural

I. INTRODUCTION

Daily sediment estimation is important for protecting water
resources. Measuring the sediment load of rivers is expensive and
time consuming. River flows are measured at field stations, but there
are not enough measurements of suspended sediment. In recent years,
sediment estimation studies have developed sediment rating curves
(SRC), regression methods and artificial intelligence techniques to
simulate the process with limited knowledge of the physics. In most
rivers, sediment is transported mainly as suspended sediment load
[1]. Many models have been proposed to simulate this phenomenon.
However, traditional sediment rating curves are not able to provide
sufficiently accurate results. A sediment rating curve expresses a
relation between sediment and river discharge. Such a relationship is
usually established by regression analysis, and the curves are
generally expressed in the form of a power equation. McBean and
Nassri [2] examined suspended sediment rating curves and showed that
the practice of using sediment load versus discharge is misleading,
since the goodness of fit implied by this relation is spurious.


In recent years, artificial intelligence approaches based on learning
algorithms, such as artificial neural networks (ANN), adaptive
neuro-fuzzy (NF) methods and support vector machines (SVM), have been
widely used in water resources management and hydrological projects
[3,4,5,6,7,8,9,10,11,12]. Mustafa et al. [13] used a multilayer
perceptron feed-forward neural network with different algorithms to
predict the suspended sediment discharge of a river in Peninsular
Malaysia. Demirci and Baltaci [14] investigated the performance of
sediment rating curves (SRC), multiple linear regression (MLR) and
fuzzy logic (FL) for suspended sediment prediction. Afan et al. [15]
used feed-forward neural network and radial basis function methods
for sediment estimation. Buyukyildiz and Kumcu [16] investigated the
viability of artificial intelligence techniques for predicting the
sediment load gauged at a station in Turkey. They analyzed the
support vector machine (SVM), artificial neural network (ANN) and
adaptive neuro-fuzzy inference system (ANFIS) methods. According to
their results, SVM, ANN and ANFIS performed well in the test phase.
Nivesh and Kumar [17] evaluated and validated artificial neural
network (ANN) and regression models for predicting sediment load in
the Vamsadhara river basin in south India.

II.  APPROACHES 

In this paper, the SRC, MLR, ANN and M5tree modeling approaches are
used to forecast the sediment load, and their modeling performances
are compared. To forecast sediment concentration, daily streamflow,
water temperature and suspended sediment time series data from one
station in the USA are used.

2.1.  Sediment  Rating  Curve  (SRC) 

A sediment rating curve (SRC) relates the suspended sediment
concentration in a river to the stream discharge. The SRC generally
represents a functional relationship of the form

$$S = a\,Q^{b} \qquad (1)$$

in which Q is the stream discharge (m^3/s) and S is the suspended
sediment concentration (mg/L). The constants a and b are determined
by a linear regression between log S and log Q.
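The log-log regression behind equation (1) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `fit_rating_curve` and the demonstration values are hypothetical, and base-10 logarithms are assumed (any base gives the same a and b).

```python
import math

def fit_rating_curve(discharge, concentration):
    """Fit S = a * Q**b by least squares on log10(S) versus log10(Q)."""
    log_q = [math.log10(q) for q in discharge]
    log_s = [math.log10(s) for s in concentration]
    n = len(log_q)
    mean_q = sum(log_q) / n
    mean_s = sum(log_s) / n
    sxy = sum((x - mean_q) * (y - mean_s) for x, y in zip(log_q, log_s))
    sxx = sum((x - mean_q) ** 2 for x in log_q)
    b = sxy / sxx                      # slope of the log-log line
    a = 10 ** (mean_s - b * mean_q)    # intercept transformed back
    return a, b

# Hypothetical data generated from S = 2 * Q**1.5 (not from the paper)
q_demo = [10.0, 20.0, 50.0, 100.0, 200.0]
s_demo = [2.0 * q ** 1.5 for q in q_demo]
a_fit, b_fit = fit_rating_curve(q_demo, s_demo)
```

Because the fit is done in log space, it requires strictly positive Q and S, and it minimizes relative rather than absolute errors.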

2.2.  Multiple  Linear  Regression  (MLR) 

Multiple linear regression (MLR) determines the relationship between
two or more explanatory variables and a response variable by fitting
a linear equation to the measured data. If a dependent variable y is
assumed to be affected by n independent variables $x_1, x_2, \dots,
x_n$, the MLR equation is

$$y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_n x_n \qquad (2)$$
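One common way to obtain the coefficients of equation (2) is ordinary least squares through the normal equations. The sketch below assumes that approach; the helper names `solve_linear_system` and `fit_mlr` and the demonstration data are hypothetical and are not taken from the paper.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution

def fit_mlr(rows, targets):
    """Return [b0, b1, ..., bn] minimizing the squared error of equation (2)."""
    design = [[1.0] + list(row) for row in rows]   # leading 1 carries b0
    m = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(m)] for i in range(m)]
    xty = [sum(r[i] * t for r, t in zip(design, targets)) for i in range(m)]
    return solve_linear_system(xtx, xty)

# Hypothetical data following y = 1 + 2*x1 + 3*x2 exactly (not from the paper)
x_demo = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y_demo = [1.0 + 2.0 * x1 + 3.0 * x2 for x1, x2 in x_demo]
coefficients = fit_mlr(x_demo, y_demo)
```

The normal equations are adequate for a handful of well-scaled inputs; for strongly correlated inputs a QR-based or SVD-based solver is numerically safer.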

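In the multiple linear regression method, the regression coefficients
$b_0, b_1, b_2, b_3, \dots, b_n$ are determined statistically. The
equations for the regression coefficients, written here for a single
independent variable x and N observations, are given below.

$$b_1 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})^2} \qquad (3)$$

$$b_0 = \bar{y} - b_1\,\bar{x} \qquad (4)$$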
Here, a barred symbol such as $\bar{x}$ denotes the mean value of
that variable.

2.3. M5 Tree (M5T)

The M5 approach was introduced by Quinlan [18]. M5 is a system that
builds tree-based, piecewise linear models. The approach is related
to classification methods that generate decision trees. Model tree
construction takes place in stages. The first stage uses a
partitioning criterion to form a decision tree. The partitioning
criterion of the M5 tree algorithm treats the standard deviation of
the class values that reach a node as a measure of the error at that
node, and evaluates each attribute test by the expected reduction in
this error. The standard deviation reduction (SDR) is given by

$$\mathrm{SDR} = sd(T) - \sum_{i} \frac{|T_i|}{|T|}\, sd(T_i) \qquad (5)$$

where sd denotes the standard deviation, T is the set of instances
that reaches the node, and $T_i$ is the subset of instances that have
the i-th outcome of the potential test [19]. After all possible tests
have been examined, M5 selects the test that maximizes this expected
error reduction. Readers who want to learn more about the M5 model
tree can consult Quinlan [18].

2.4. Artificial Neural Networks (ANN)

Artificial neural networks (ANN) are computing systems developed, on
the model of the human brain, to derive, create and discover new
information through learning without any help. Inspired by the human
brain, they are the result of mathematical modeling of the learning
process. The most widely used ANN method is the feed-forward
back-propagation ANN approach, which operates according to the
principle of back propagation of errors.

In this model, an artificial neural network consists of the input
layer, the variable weight factors, the summation (total) function,
the activation function and the output layer. An ANN structure with
three layers (input, hidden and output) is given in Fig. 1.

Fig. 1. ANN structure with three layers (input, hidden and output
layers) used in suspended sediment estimation

In Fig. 1, $W_{ij}$ are the connection weights between the input and
the hidden layer, and $W_{jk}$ are the connection weights between the
hidden layer and the output layer. These weights are coefficients
that express the effect of the data arriving from the previous layer
on the processing element. They are initially assigned random values
and are then adjusted continually during training by comparing the
estimated outputs with the actual output values. The errors are
propagated backwards until the connection weights that minimize the
error are reached.
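The weight adjustment described above can be illustrated with a single sigmoid cell (equation (7)) trained by gradient descent. This is a sketch under stated assumptions, not the training procedure used in the study: the squared-error loss, the learning rate, the starting weights and the sample values are all hypothetical.

```python
import math

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

def train_step(weights, bias, inputs, target, rate):
    """One gradient-descent update for a single sigmoid cell.

    The squared error E = 0.5 * (output - target)**2 is an assumed loss.
    Returns the updated weights and bias and the error before the update.
    """
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    output = sigmoid(net)
    delta = (output - target) * output * (1.0 - output)   # dE/dnet
    new_weights = [w - rate * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - rate * delta
    return new_weights, new_bias, 0.5 * (output - target) ** 2

weights, bias = [0.1, -0.2], 0.0     # arbitrary starting values
errors = []
for _ in range(50):
    weights, bias, err = train_step(weights, bias, [0.5, 1.0], 0.9, rate=0.5)
    errors.append(err)
```

In a multi-layer network the same error term is passed back through the hidden layer, which is what gives back-propagation its name.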

Each cell in the hidden and output layers in Fig. 1 passes the data
from the previous layer to the summation (total) function (net). This
function calculates the net input to the cell according to the
following equation.

$$net_{pj} = \sum_{i=1}^{N} W_{ij}\, x_{pi} + b_j \qquad (6)$$

In equation (6), N is the size of the input vector, $b_j$ is the bias
term, $W_{ij}$ is the set of weights between layers i and j, and
$x_{pi}$ is the input from layer i for the p-th instance. The
activation function generates the output f(net) by passing the net
value through a nonlinear function in each cell of layers j and k.
One of the most commonly used activation functions is the sigmoid
function. The sigmoid function is used in this study and is expressed
in equation (7).

$$f(net) = \frac{1}{1 + e^{-net}} \qquad (7)$$
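Equations (6) and (7) together define the forward pass of the three-layer network. The following sketch shows that computation; the layer sizes, the all-zero weights and the input values are hypothetical and chosen only to keep the example easy to check.

```python
import math

def sigmoid(net):
    """Equation (7): logistic activation."""
    return 1.0 / (1.0 + math.exp(-net))

def layer_output(inputs, weights, biases):
    """Apply equations (6) and (7) to every cell j of one layer.

    weights[i][j] connects input i to cell j; biases[j] is b_j.
    """
    outputs = []
    for j in range(len(biases)):
        net = sum(weights[i][j] * inputs[i] for i in range(len(inputs))) + biases[j]
        outputs.append(sigmoid(net))
    return outputs

def forward(inputs, w_ij, b_j, w_jk, b_k):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer_output(inputs, w_ij, b_j)
    return layer_output(hidden, w_jk, b_k)

# Hypothetical network: 4 inputs, 2 hidden cells, 1 output, all-zero weights
x = [0.2, 0.4, 0.6, 0.8]
w_ij = [[0.0, 0.0] for _ in x]
w_jk = [[0.0], [0.0]]
y = forward(x, w_ij, [0.0, 0.0], w_jk, [0.0])
```

Since the sigmoid output always lies between 0 and 1, target values such as sediment concentration have to be scaled into that range before training and scaled back afterwards; how this was done in the study is not stated.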


III.  APPROACH  RESULTS 

In this study, the viability of all the approaches for sediment
prediction in a river was investigated. Measurement data from the
United States Geological Survey (USGS) were used. A total of 700
daily field measurements were used for estimation. The data were
divided into two parts, training and test data: 70% of the data were
used for training and the remaining 30% were used for testing the
models.
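A split of this kind can be sketched as below. The paper later reports 500 training and 200 test observations, so the sketch splits by count; that the split is chronological (first days for training, last days for testing) is an assumption, and `split_series` is a hypothetical helper.

```python
def split_series(records, n_train):
    """Split a time-ordered list into a leading training part and a trailing test part."""
    if not 0 < n_train < len(records):
        raise ValueError("n_train must leave at least one test record")
    return records[:n_train], records[n_train:]

days = list(range(700))            # stand-in for the 700 daily records
train, test = split_series(days, n_train=500)
```

Keeping the test period after the training period avoids using future observations to predict earlier ones when lagged inputs are involved.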

3.1.  Error  Analysis 

For each model, statistical parameters such as the mean square error
(MSE), the mean absolute error (MAE) and the correlation coefficient
(R) between the approach predictions and the observations were
calculated. These parameters are used to compare the performance of
the approaches. The MSE and MAE equations are given as

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( Y_i^{observed} - Y_i^{estimate} \right)^2 \qquad (8)$$

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| Y_i^{observed} - Y_i^{estimate} \right| \qquad (9)$$

where N is the number of data values used and $Y_i$ is the sediment
concentration.
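The three criteria can be computed as in the sketch below. MSE and MAE follow equations (8) and (9); for R the Pearson correlation coefficient is assumed, since the paper does not give its formula. The sample values are hypothetical.

```python
import math

def mse(observed, estimated):
    """Equation (8): mean square error."""
    return sum((o - e) ** 2 for o, e in zip(observed, estimated)) / len(observed)

def mae(observed, estimated):
    """Equation (9): mean absolute error."""
    return sum(abs(o - e) for o, e in zip(observed, estimated)) / len(observed)

def correlation(observed, estimated):
    """Pearson correlation coefficient R (assumed definition)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    dev_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    dev_e = math.sqrt(sum((e - mean_e) ** 2 for e in estimated))
    return cov / (dev_o * dev_e)

obs = [100.0, 200.0, 300.0, 400.0]   # hypothetical observed values (mg/L)
est = [110.0, 190.0, 310.0, 390.0]   # hypothetical estimates (mg/L)
scores = (mse(obs, est), mae(obs, est), correlation(obs, est))
```

MSE weights large errors more heavily than MAE, so a model can rank differently on the two criteria when a few peak events are badly predicted.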

3.2. Sediment Rating Curve (SRC) Results


Fig.  2.  Sediment  Rating  Curve  graph 


For the SRC model, the streamflow (Q) was used as the input value.
The conventional SRC formed between the streamflow and sediment
concentration data is shown in Fig. 2. The SRC distribution and
scatter graphs for the test data are shown in Fig. 3 and Fig. 4.


Fig.  3.  Measurement  and  SRC  distribution  graph  for  test 
data 


When the distribution graph for the test data in Fig. 3 is analyzed,
the sediment concentration values estimated by the SRC are seen to
differ from the actual values. The correlation coefficient was
obtained as R = 0.5848. The sediment rating curve values lie far from
the actual values.


Fig.  4.  Measurement  and  SRC  scatter  graph  for  test  data 

3.3.  Multiple  Linear  Regression  (MLR)  Results 

For multiple linear regression (MLR), the average water temperature
($T_{mean}$), the streamflow (Q), the lagged streamflow ($Q_{t-1}$,
at time t-1) and the lagged sediment concentration ($S_{t-1}$, at
time t-1) were used as input values.
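Building this four-input combination from the daily series can be sketched as follows. The function name `build_samples`, the input ordering and the numbers are hypothetical; the only point illustrated is how the one-day lag pairs day t with day t-1.

```python
def build_samples(flow, temperature, sediment):
    """Return (inputs, target) pairs; the first day is dropped because it has no lag."""
    samples = []
    for t in range(1, len(sediment)):
        # inputs: T_mean(t), Q(t), Q(t-1), S(t-1); target: S(t)
        inputs = (temperature[t], flow[t], flow[t - 1], sediment[t - 1])
        samples.append((inputs, sediment[t]))
    return samples

flow = [12.0, 15.0, 30.0, 22.0]          # hypothetical Q values
temperature = [8.0, 8.5, 9.0, 9.2]       # hypothetical T_mean values
sediment = [40.0, 55.0, 140.0, 90.0]     # hypothetical S values
samples = build_samples(flow, temperature, sediment)
```

Using the previous day's measured sediment as an input means the model can only be applied where sediment was measured the day before.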


Fig.  5.  Measurement  and  MLR  distribution  graph  for  test 
data 


The MLR distribution and scatter graphs for the test data are shown
in Fig. 5 and Fig. 6. The correlation coefficient was obtained as
R = 0.8462. In the test phase, the MLR estimates of the daily
real-time suspended sediment concentration are better than the SRC
values and agree well with the actual values. In the distribution and
scatter charts, the MLR values are close to the actual values.


Fig.  6.  Measurement  and  MLR  scatter  graph  for  test  data 

3.4.  M5Tree  (M5T)  Results 

For M5Tree (M5T), the average water temperature ($T_{mean}$), the
streamflow (Q), the lagged streamflow ($Q_{t-1}$, at time t-1) and
the lagged sediment concentration ($S_{t-1}$, at time t-1) were used
as input values.


ISSN:  2349-6495(9)  /  2456-1908(0) 


Fig.  7.  Measurement  and  M5Tree  distribution  graph  for 
test  data 


The M5T distribution and scatter graphs for the test data are shown
in Fig. 7 and Fig. 8. The correlation coefficient was obtained as
R = 0.8486. In the test phase, the M5T predictions of the daily
real-time suspended sediment concentration are better than the SRC
predictions and agree well with the actual values. In the
distribution and scatter charts, the M5T predictions are close to the
actual values.


Fig.  8.  Measurement  and  M5Tree  scatter  graph  for  test 
data 


3.5.  Artificial  Neural  Networks  (ANN)  Results 

For Artificial Neural Networks (ANN), the average water temperature
($T_{mean}$), the streamflow (Q), the lagged streamflow ($Q_{t-1}$,
at time t-1) and the lagged sediment concentration ($S_{t-1}$, at
time t-1) were used as input values.

Fig. 9. Measurement and ANN distribution graph for test data

Fig. 10. Measurement and ANN scatter graph for test data (horizontal
axis: measured S, mg/L)

The correlation coefficient R = 0.8908 was obtained for the test data
with the ANN approach. The ANN predictions in the test phase are
good, and in this study the ANN predictions of the observed daily
real-time sediment concentrations are slightly better than those of
the MLR and M5T models. Overall, the ANN model has low error rates
and a high correlation.

3.6.  Approach  Results  and  Analyses 


Table 1: Comparison of approach performances

Approaches       SRC        MLR                         M5T                         ANN
Approach inputs  Q          Q, Q_{t-1}, T, S_{t-1}      Q, Q_{t-1}, T, S_{t-1}      Q, Q_{t-1}, T, S_{t-1}
MSE              117877.4   82268.13                    60883.11                    45242.93
MAE              207.86     144.22                      143.95                      134.80
R                0.5848     0.8462                      0.8686                      0.8908

MSE: mean square error; MAE: mean absolute error; R: correlation
coefficient

The results of the SRC, MLR, M5T and ANN models are as follows. The
500 daily observations used to train the ANN approach were also used
to train the MLR and M5T methods. In the second step, the trained
models were applied to the inputs of the test data, which consist of
200 daily observations, and the results obtained with each approach
were compared with the measured values. The results are given in
Table 1 above.

According to Table 1, the best approach is the one with the smallest
MSE and MAE and the largest R. According to MSE, MAE and R, the SRC
approach (117877.4, 207.86, 0.5848) has the lowest success rate. The
ANN (45242.93, 134.80, 0.8908), MLR (82268.13, 144.22, 0.8462) and
M5T (60883.11, 143.95, 0.8686) approaches were found to perform
better than the SRC approach in all performance evaluations.
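The ranking can be checked directly from the Table 1 values. The sketch below only rearranges the reported numbers; the derived quantities (the root of the MSE and the relative MSE reduction) are illustrative additions, and reading the root of the MSE in mg/L assumes the MSE is reported in (mg/L)^2.

```python
import math

# Values copied from Table 1
table_1 = {
    "SRC": {"MSE": 117877.4, "MAE": 207.86, "R": 0.5848},
    "MLR": {"MSE": 82268.13, "MAE": 144.22, "R": 0.8462},
    "M5T": {"MSE": 60883.11, "MAE": 143.95, "R": 0.8686},
    "ANN": {"MSE": 45242.93, "MAE": 134.80, "R": 0.8908},
}

# Best approach per criterion: smallest error, largest correlation
best_mse = min(table_1, key=lambda name: table_1[name]["MSE"])
best_mae = min(table_1, key=lambda name: table_1[name]["MAE"])
best_r = max(table_1, key=lambda name: table_1[name]["R"])

# Root of the MSE, in the units of S if MSE is in squared units of S
rmse = {name: math.sqrt(row["MSE"]) for name, row in table_1.items()}

# Fraction by which the ANN reduces the MSE relative to the SRC
mse_reduction_vs_src = 1.0 - table_1["ANN"]["MSE"] / table_1["SRC"]["MSE"]
```

All three criteria select the same approach, so the ranking does not depend on which criterion is given priority.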

The predictions of suspended sediment show that the accuracy of the
approaches increases with different input combinations. Fig. 6,
Fig. 8 and Fig. 10 provide the scatter plots of the observed and
predicted sediment amounts for the MLR, M5T and ANN test periods. As
seen from Table 1, the MLR, M5T and ANN approaches, which use the
four-input combination, have smaller MSE and MAE and higher R during
the test period than the SRC approach. However, the ANN approach is
slightly better than the MLR and M5T models for forecasting daily
real-time sediment concentrations.

IV.  CONCLUSIONS 

In this study, the abilities of artificial neural networks (ANN) and
M5Tree (M5T) models and of statistical approaches such as Multiple
Linear Regression (MLR) and Sediment Rating Curves (SRC) to estimate
sediment concentration were investigated. Average water temperature,
daily real-time flow rate and sediment concentration data from a
station in the US were used.

When the results are evaluated, the MLR, M5T and ANN approaches have
smaller MSE and MAE and higher R than the SRC. The ANN approach is
slightly better than the MLR and M5T models for forecasting daily
real-time sediment concentrations. The worst results in all criteria
were obtained with the classical sediment rating curve (SRC) method.

Although all of the present modeling approaches are helpful and
important in water resources management studies, this paper shows
that the ANN can be a viable alternative for river sediment
prediction in future research.

ANN applications developed for a specific region can be a very useful
method for predicting sediment concentration, in terms of both the
level of error and the closeness of the estimates to the observed
values.